
Conversation

@tinque (Contributor) commented Oct 6, 2025

Implements #7809 and ports #7822 into the @langchain/aws library

changeset-bot commented Oct 6, 2025

🦋 Changeset detected

Latest commit: 3181380

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:
  • @langchain/aws (Minor)


vercel bot commented Oct 6, 2025

@tinque is attempting to deploy a commit to the LangChain Team on Vercel.

A member of the Team first needs to authorize it.

vercel bot commented Oct 6, 2025

1 Skipped Deployment: langchainjs-docs (Ignored, Oct 6, 2025 4:54pm UTC)

@tinque force-pushed the application-inference-profile branch from 5982b00 to 2f5c28d on October 6, 2025 16:54
@christian-bromann (Member) commented:

Really appreciate you taking the time to open this PR, @tinque 🙏
The team is currently focused on shipping our v1 release, so we’re pausing detailed reviews for a bit. We’ll follow up within the next 2–4 weeks once things settle — thanks so much for your patience!

@tinque (Contributor, Author) commented Oct 9, 2025

I totally understand @christian-bromann that the team is focused on the v1 release — congrats on that milestone! 🎉

That said, this change is quite minor and already covered by unit tests.
Would it be possible to get a quick review or exception on this one? It’s currently blocking our production deployment on our side, so even a short-term workaround or early merge would be extremely helpful.

Thanks again for your time and all the great work you’re doing with LangChain!

@tinque force-pushed the application-inference-profile branch from 2f5c28d to 40befc0 on October 16, 2025 08:56
@tinque force-pushed the application-inference-profile branch from 16d3417 to 2ed02b2 on October 16, 2025 11:26
@christian-bromann (Member) left a comment

Finally been able to take a look at this. What do you think about just extending the documentation for model to hint to users that they can also pass an Application Inference Profile ARN? Two concerns I have with this approach:

  • maintaining an additional field that represents the model in some cases
  • can we guarantee that the model behind the profile is the same as the one specified in model? Should we check for that?

Thoughts?
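
For reference, the documentation-only alternative would look roughly like this; a minimal sketch, assuming the profile ARN is simply passed as model (the ARN and region are placeholders):

```typescript
import { ChatBedrockConverse } from "@langchain/aws";

// Documentation-only alternative: pass the Application Inference Profile ARN
// directly as `model`. Bedrock's Converse API accepts a profile ARN as the
// modelId, but tracing metadata (e.g. in LangSmith) then only sees the opaque
// ARN instead of the underlying model name.
const llm = new ChatBedrockConverse({
  model:
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456", // placeholder ARN
  region: "us-east-1",
});

const response = await llm.invoke("Hello!");
console.log(response.content);
```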

@tinque (Contributor, Author) commented Oct 29, 2025

Thanks Christian! 👋

Good points — here’s some context on that:

About maintaining an additional model field
That’s indeed the tricky part. If we only pass the inference profile ARN, inference works fine, but we lose the actual model name in the metadata.
That means we can’t properly track cost or latency per model in LangSmith, for example, which is why keeping an explicit model field is useful.

About guaranteeing that the profile and model match
Unfortunately, there’s currently no API to “describe” an inference profile and retrieve the underlying model.
The profile ID is opaque, so we can’t programmatically validate that it matches the model field.

That’s why, for now, the safest approach is to let users explicitly define both — model (for metadata/tracking) and inferenceProfileArn (for execution).
Once AWS exposes an API to resolve profiles, we could definitely add a validation layer.

For context, this implementation follows the same approach as #7822, which was based on the discussion in #7809.
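
Here is a minimal sketch of that two-field approach, assuming the applicationInferenceProfile option name used in this PR's implementation (see the review summary below); the model id and ARN are placeholders:

```typescript
import { ChatBedrockConverse } from "@langchain/aws";

// `model` keeps the real model name so LangSmith can attribute cost/latency
// correctly, while the profile ARN is what is actually sent as the modelId
// at execution time. Values below are placeholders.
const llm = new ChatBedrockConverse({
  model: "anthropic.claude-3-5-sonnet-20240620-v1:0", // metadata/tracking
  applicationInferenceProfile:
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456", // execution
  region: "us-east-1",
});
```

Since nothing can currently validate that the profile routes to the declared model, keeping the two values in sync stays the caller's responsibility.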

@tinque (Contributor, Author) commented Nov 4, 2025

Hi 👋
Just a friendly reminder about this PR — is there anything I can do to help move it forward?
It’s aligned with the approach discussed in #7822 and #7809.
Thanks a lot for your time!

@chabli commented Nov 6, 2025

I'm also interested. Thank you!

@callmeGillou commented:

Hey there,
Any news on this topic? We'd love to implement it ASAP, as we haven't found any workaround with LangSmith.
I'd appreciate your support. Thanks!

@chabli commented Nov 6, 2025

It's a must-have to use LangSmith!

@christian-bromann (Member) left a comment

@tinque thanks for pinging. I think your comments make sense. I am ok moving forward as is. One request though: can we add a section to the README.md on using application inference profiles and the implications we discussed in this thread?

@hntrl any concerns?

Copilot AI review requested due to automatic review settings on November 10, 2025 18:35
@tinque (Contributor, Author) commented Nov 10, 2025

Hey @christian-bromann !

Added documentation for the inference profile feature in the README. It covers the usage and explains why we need both parameters for proper metadata tracking.

Check out the latest commit and let me know if you'd like any changes! 👍

Copilot AI left a comment

Pull Request Overview

This PR adds support for AWS Bedrock Application Inference Profiles to the ChatBedrockConverse class, allowing users to route inference requests through custom endpoints that can manage cross-region traffic.

  • Adds applicationInferenceProfile parameter to override the model ID in API calls while preserving metadata tracking
  • Updates both streaming and non-streaming code paths to use the inference profile ARN when provided
  • Includes comprehensive test coverage for the new functionality

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

  • libs/providers/langchain-aws/src/chat_models.ts: Adds the applicationInferenceProfile property and logic to use it as the modelId in ConverseCommand and ConverseStreamCommand when provided
  • libs/providers/langchain-aws/src/tests/chat_models.test.ts: Adds a comprehensive test suite covering initialization and command creation with and without an inference profile, for both streaming and non-streaming modes
  • libs/providers/langchain-aws/README.md: Documents the new Application Inference Profiles feature with usage examples and important notes about model metadata tracking
  • .changeset/wet-taxis-heal.md: Adds a changeset entry marking this as a minor version feature addition
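
To make the reviewed behavior concrete, here is a hedged usage sketch of the streaming path (option name per this PR; the model id and ARN are placeholders):

```typescript
import { ChatBedrockConverse } from "@langchain/aws";

// When applicationInferenceProfile is set, it is used as the modelId for both
// ConverseCommand (invoke) and ConverseStreamCommand (stream), while `model`
// is preserved for metadata tracking. Values below are placeholders.
const llm = new ChatBedrockConverse({
  model: "anthropic.claude-3-5-sonnet-20240620-v1:0",
  applicationInferenceProfile:
    "arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456",
  region: "us-east-1",
});

const stream = await llm.stream("Hello!");
for await (const chunk of stream) {
  console.log(chunk.content);
}
```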


@christian-bromann (Member) left a comment

LGTM 👍

@hntrl thoughts?

@tinque (Contributor, Author) commented Nov 17, 2025

Hi @christian-bromann @hntrl
Just a friendly reminder about this PR — is there anything I can do to help move it forward?
